Skip to content

feat(operator): add per-tenant namespace RBAC for multi-tenant job execution#656

Merged
ruivieira merged 2 commits intotrustyai-explainability:dev/evalhub-multi-tenancyfrom
ruivieira:dev/evalhub-multi-tenancy/sa
Mar 3, 2026
Merged

feat(operator): add per-tenant namespace RBAC for multi-tenant job execution#656
ruivieira merged 2 commits intotrustyai-explainability:dev/evalhub-multi-tenancyfrom
ruivieira:dev/evalhub-multi-tenancy/sa

Conversation

@ruivieira
Copy link
Member

@ruivieira ruivieira commented Mar 3, 2026

Summary

  • Adds per-tenant namespace RBAC provisioning to the EvalHub controller, enabling multi-tenant job execution
  • Controller now watches for namespaces annotated with mlflow.kubeflow.org/workspace-description and automatically creates a jobs ServiceAccount + 3 RoleBindings (jobs-writer, job-config, mlflow-access) in each tenant namespace
  • Uses label-based cleanup (trustyai.opendatahub.io/managed-by) instead of owner references (cross-namespace owner refs are forbidden in Kubernetes)
  • Tenant RBAC is non-blocking: failures log a warning event but don't prevent the EvalHub deployment from proceeding
  • Cleanup runs both on annotation removal (stale resource GC) and on EvalHub CR deletion

Changes

  • controllers/evalhub/constants.go: Added tenantAnnotation and tenantLabel constants
  • controllers/evalhub/evalhub_controller.go: Namespace watch via Watches(), RBAC marker for namespace read access, reconcileTenantNamespaces() call in reconcile loop, cleanupTenantResources() call in deletion handler, namespacesToEvalHub mapper
  • controllers/evalhub/service_accounts.go: reconcileTenantNamespaces(), reconcileTenantNamespace(), ensureTenantServiceAccount(), ensureTenantRoleBinding(), cleanupTenantResources()
  • controllers/evalhub/tenant_rbac_test.go: 6 unit tests (resource creation, idempotency, control-plane skip, annotation-removal cleanup, deletion cleanup, multi-namespace handling)

Unchanged:

  • API SA, its Roles/RoleBindings, auth reviewer CRB
  • Jobs SA in control-plane namespace (separate PR)
  • Deployment, Service, ConfigMaps, Route
  • EvalHub service code (X-Tenant header handling) (separate PR)

Summary by CodeRabbit

Release Notes

  • New Features

    • Added multi-tenant namespace RBAC support with automatic provisioning and lifecycle management of tenant-scoped access controls across namespaces.
    • Enabled automatic detection and reconciliation of tenant namespace configuration changes.
  • Tests

    • Added comprehensive test suite for tenant access control reconciliation logic.

@ruivieira ruivieira self-assigned this Mar 3, 2026
@ruivieira ruivieira added the kind/enhancement New feature or request label Mar 3, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 3, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
controllers/evalhub/service_accounts.go (2)

894-915: Tenant SA labels are not updated if they drift.

The function returns early if the ServiceAccount exists without checking whether its labels match the desired state. If labels are manually modified or become inconsistent, subsequent cleanup operations could fail to identify managed resources.

♻️ Proposed refactor to ensure label consistency
 func (r *EvalHubReconciler) ensureTenantServiceAccount(ctx context.Context, name, namespace string, labels map[string]string) error {
 	sa := &corev1.ServiceAccount{}
 	err := r.Get(ctx, types.NamespacedName{Name: name, Namespace: namespace}, sa)
 	if err == nil {
-		return nil // already exists
+		// Ensure labels are up-to-date
+		needsUpdate := false
+		if sa.Labels == nil {
+			sa.Labels = make(map[string]string)
+		}
+		for k, v := range labels {
+			if sa.Labels[k] != v {
+				sa.Labels[k] = v
+				needsUpdate = true
+			}
+		}
+		if needsUpdate {
+			log.FromContext(ctx).Info("Updating tenant SA labels", "namespace", namespace, "name", name)
+			return r.Update(ctx, sa)
+		}
+		return nil
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@controllers/evalhub/service_accounts.go` around lines 894 - 915,
ensureTenantServiceAccount currently returns early if the ServiceAccount exists
and does not reconcile labels; change it to fetch the existing SA (using r.Get
in ensureTenantServiceAccount), compare its ObjectMeta.Labels with the desired
labels passed in, and if they differ replace/merge as appropriate and call
r.Update(ctx, sa) (or a patch) to persist the label changes; keep the not-found
path creating the SA as-is and ensure errors from Update are returned.

782-788: Consider continuing reconciliation for remaining namespaces on partial failure.

When reconcileTenantNamespace fails for one namespace, the function returns immediately without attempting to reconcile other tenant namespaces. This means a single problematic namespace blocks RBAC provisioning for all other tenants. Consider accumulating errors and continuing, or at minimum documenting this behavior.

♻️ Proposed refactor to continue on partial failure
+	var reconcileErr error
 	// Reconcile each tenant namespace
 	for ns := range tenantNS {
 		if err := r.reconcileTenantNamespace(ctx, instance, ns); err != nil {
 			log.Error(err, "Failed to reconcile tenant namespace", "namespace", ns)
-			return fmt.Errorf("reconciling tenant namespace %s: %w", ns, err)
+			reconcileErr = fmt.Errorf("reconciling tenant namespace %s: %w", ns, err)
+			// Continue with other namespaces
 		}
 	}
+
+	// Return first error after attempting all namespaces
+	if reconcileErr != nil {
+		return reconcileErr
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@controllers/evalhub/service_accounts.go` around lines 782 - 788, The loop
over tenantNS currently returns on the first error from
reconcileTenantNamespace, blocking reconciliation for remaining namespaces;
change this to continue reconciling all namespaces while collecting errors
(e.g., append errors to a slice or use a multierror) and log each failure with
log.Error including the namespace, then after the loop return a composed error
if any failures occurred (or nil if none). Specifically update the loop that
calls r.reconcileTenantNamespace(ctx, instance, ns) to not return immediately on
error, accumulate per-namespace errors, and return an aggregated error (or
wrapped error list) at the end so callers can see partial-failure details.
controllers/evalhub/tenant_rbac_test.go (1)

252-265: Consider using assert.Error for clearer test assertions.

Using assert.True(t, err != nil, ...) is less idiomatic than assert.Error(t, err, ...) for verifying that an error occurred.

♻️ Proposed refactor
 	// Verify SA was cleaned up
 	err = fakeClient.Get(ctx, types.NamespacedName{
 		Name:      evalHubName + "-jobs",
 		Namespace: tenantNS,
 	}, sa)
-	assert.True(t, err != nil, "expected SA to be deleted after annotation removal")
+	assert.Error(t, err, "expected SA to be deleted after annotation removal")

 	// Verify RoleBindings were cleaned up
 	rb := &rbacv1.RoleBinding{}
 	err = fakeClient.Get(ctx, types.NamespacedName{
 		Name:      evalHubName + "-tenant-jobs-writer",
 		Namespace: tenantNS,
 	}, rb)
-	assert.True(t, err != nil, "expected RoleBinding to be deleted after annotation removal")
+	assert.Error(t, err, "expected RoleBinding to be deleted after annotation removal")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@controllers/evalhub/tenant_rbac_test.go` around lines 252 - 265, Replace the
non-idiomatic nil-check assertions using assert.True(t, err != nil, ...) with
assert.Error to make the tests clearer: for the ServiceAccount check (after
fakeClient.Get into sa for Name evalHubName + "-jobs" in namespace tenantNS) use
assert.Error(t, err, "expected SA to be deleted after annotation removal"), and
for the RoleBinding check (fakeClient.Get into rb for Name evalHubName +
"-tenant-jobs-writer" in namespace tenantNS) use assert.Error(t, err, "expected
RoleBinding to be deleted after annotation removal").
controllers/evalhub/evalhub_controller.go (1)

212-214: Consider adding a predicate to filter namespace events.

The current implementation triggers reconciliation of all EvalHub instances on every namespace event (create, update, delete). In clusters with many namespaces and frequent namespace activity, this could cause unnecessary reconciliation churn.

♻️ Proposed refactor to filter on annotation changes
+import "sigs.k8s.io/controller-runtime/pkg/predicate"
+
+// tenantAnnotationPredicate filters namespace events to only those
+// where the tenant annotation changes (added, removed, or modified).
+func tenantAnnotationPredicate() predicate.Predicate {
+	return predicate.Funcs{
+		CreateFunc: func(e event.CreateEvent) bool {
+			_, ok := e.Object.GetAnnotations()[tenantAnnotation]
+			return ok
+		},
+		UpdateFunc: func(e event.UpdateEvent) bool {
+			_, oldHas := e.ObjectOld.GetAnnotations()[tenantAnnotation]
+			_, newHas := e.ObjectNew.GetAnnotations()[tenantAnnotation]
+			return oldHas != newHas
+		},
+		DeleteFunc: func(e event.DeleteEvent) bool {
+			_, ok := e.Object.GetAnnotations()[tenantAnnotation]
+			return ok
+		},
+	}
+}

Then use it in the watch:

 		Watches(&corev1.Namespace{},
 			handler.EnqueueRequestsFromMapFunc(r.namespacesToEvalHub),
+			builder.WithPredicates(tenantAnnotationPredicate()),
 		).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@controllers/evalhub/evalhub_controller.go` around lines 212 - 214, The Watch
on corev1.Namespace currently enqueues all namespace events via
r.namespacesToEvalHub; add a predicate to only enqueue when namespace
annotations relevant to EvalHub change (or when a namespace gains/loses the
annotation) to avoid reconciliation churn. Implement a predicate (e.g.,
namespaceAnnotationPredicate) using predicate.Funcs that returns false for
Create/Delete/Update unless the annotation set/value you care about differs
between old and new (for Update compare Old.GetAnnotations() vs
New.GetAnnotations(), for Create/Delete check presence of the annotation), then
attach it to the watch call (replace the current Watches(...). with
Watches(&corev1.Namespace{},
handler.EnqueueRequestsFromMapFunc(r.namespacesToEvalHub),
builder.WithPredicates(namespaceAnnotationPredicate)). Ensure you reference the
same r.namespacesToEvalHub mapping function and use the new predicate variable
in the builder.WithPredicates call.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@controllers/evalhub/evalhub_controller.go`:
- Around line 212-214: The Watch on corev1.Namespace currently enqueues all
namespace events via r.namespacesToEvalHub; add a predicate to only enqueue when
namespace annotations relevant to EvalHub change (or when a namespace
gains/loses the annotation) to avoid reconciliation churn. Implement a predicate
(e.g., namespaceAnnotationPredicate) using predicate.Funcs that returns false
for Create/Delete/Update unless the annotation set/value you care about differs
between old and new (for Update compare Old.GetAnnotations() vs
New.GetAnnotations(), for Create/Delete check presence of the annotation), then
attach it to the watch call (replace the current Watches(...). with
Watches(&corev1.Namespace{},
handler.EnqueueRequestsFromMapFunc(r.namespacesToEvalHub),
builder.WithPredicates(namespaceAnnotationPredicate)). Ensure you reference the
same r.namespacesToEvalHub mapping function and use the new predicate variable
in the builder.WithPredicates call.

In `@controllers/evalhub/service_accounts.go`:
- Around line 894-915: ensureTenantServiceAccount currently returns early if the
ServiceAccount exists and does not reconcile labels; change it to fetch the
existing SA (using r.Get in ensureTenantServiceAccount), compare its
ObjectMeta.Labels with the desired labels passed in, and if they differ
replace/merge as appropriate and call r.Update(ctx, sa) (or a patch) to persist
the label changes; keep the not-found path creating the SA as-is and ensure
errors from Update are returned.
- Around line 782-788: The loop over tenantNS currently returns on the first
error from reconcileTenantNamespace, blocking reconciliation for remaining
namespaces; change this to continue reconciling all namespaces while collecting
errors (e.g., append errors to a slice or use a multierror) and log each failure
with log.Error including the namespace, then after the loop return a composed
error if any failures occurred (or nil if none). Specifically update the loop
that calls r.reconcileTenantNamespace(ctx, instance, ns) to not return
immediately on error, accumulate per-namespace errors, and return an aggregated
error (or wrapped error list) at the end so callers can see partial-failure
details.

In `@controllers/evalhub/tenant_rbac_test.go`:
- Around line 252-265: Replace the non-idiomatic nil-check assertions using
assert.True(t, err != nil, ...) with assert.Error to make the tests clearer: for
the ServiceAccount check (after fakeClient.Get into sa for Name evalHubName +
"-jobs" in namespace tenantNS) use assert.Error(t, err, "expected SA to be
deleted after annotation removal"), and for the RoleBinding check
(fakeClient.Get into rb for Name evalHubName + "-tenant-jobs-writer" in
namespace tenantNS) use assert.Error(t, err, "expected RoleBinding to be deleted
after annotation removal").

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3fd576a and dc12592.

📒 Files selected for processing (5)
  • config/rbac/role.yaml
  • controllers/evalhub/constants.go
  • controllers/evalhub/evalhub_controller.go
  • controllers/evalhub/service_accounts.go
  • controllers/evalhub/tenant_rbac_test.go

}

// 2. RoleBinding: API SA → jobs-writer ClusterRole (create/delete jobs in tenant ns)
if err := r.ensureTenantRoleBinding(ctx, instance.Name+"-tenant-jobs-writer", namespace, managedLabels,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we are going with the option with Role and RoleBinding for EH service accounts instead of ClusterRole and ClusterRoleBinding ? I am OK with this, just wanted to confirm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mariusdanciu Yes. The tenant-namespace resources use RoleBindings that reference pre-existing ClusterRoles (e.g. evalhub-jobs-writer, evalhub-job-config, evalhub-mlflow-access).

My idea was mostly to avoid creating additional ClusterRoleBindings per tenant since with RoleBindings the API SA can only create jobs in namespaces the operator has explicitly provisioned.

log.Info("Reconciling tenant namespace RBAC", "namespace", namespace)

apiSAName := generateServiceAccountName(instance)
jobsSAName := generateJobsServiceAccountName(instance)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generateJobsServiceAccountName adds the -api suffix. For jobs SA we should probably have -job suffix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mariusdanciu thanks, I'll check again. I thought the two separate functions created

  • generateServiceAccountName(): {name}-api (the API/proxy SA)
  • generateJobsServiceAccountName(): {name}-jobs (the jobs SA)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have misread the diff.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
controllers/evalhub/tenant_rbac_test.go (1)

255-267: Use explicit NotFound assertions for cleanup checks.

Using err != nil can mask unexpected failures; asserting IsNotFound makes cleanup expectations precise.

🔍 Suggested test assertion refinement
+import apierrors "k8s.io/apimachinery/pkg/api/errors"
@@
-assert.True(t, err != nil, "expected SA to be deleted after annotation removal")
+assert.True(t, apierrors.IsNotFound(err), "expected SA to be deleted after annotation removal")
@@
-assert.True(t, err != nil, "expected RoleBinding to be deleted after annotation removal")
+assert.True(t, apierrors.IsNotFound(err), "expected RoleBinding to be deleted after annotation removal")

Also applies to: 329-342, 404-409

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@controllers/evalhub/tenant_rbac_test.go` around lines 255 - 267, Replace the
loose "err != nil" assertions used after fakeClient.Get for the service account
(sa) and RoleBinding (rb) cleanup checks with explicit NotFound checks using k8s
API errors (apierrors.IsNotFound(err)); update the assertions around the Get
calls (the ones referencing evalHubName+"-jobs" and
evalHubName+"-tenant-jobs-writer") to assert that apierrors.IsNotFound(err) is
true and include a clear message, and import k8s.io/apimachinery/pkg/api/errors
as apierrors if not already present; apply the same change to the other similar
assertions mentioned (around lines 329-342 and 404-409).
controllers/evalhub/evalhub_controller.go (1)

212-214: Consider filtering namespace-triggered reconciles to tenant-relevant changes only.

Current watch fan-outs all EvalHub reconciles for any namespace update, which can become noisy at scale. Limiting triggers to tenant annotation add/remove/change (plus delete) will reduce unnecessary churn.

Also applies to: 221-240

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@controllers/evalhub/evalhub_controller.go` around lines 212 - 214, The
namespace Watch currently enqueues all EvalHub reconciles for any Namespace
change; restrict it by adding a predicate (predicate.Funcs) to the Watches on
corev1.Namespace that only returns true for Create/Delete or Update events where
the specific tenant annotation key is added/removed/changed (compare old/new
annotations for the tenant key), and keep using
handler.EnqueueRequestsFromMapFunc(r.namespacesToEvalHub); apply the same
predicate-based filtering to the other Watches in the same block (the ones
referenced around lines 221-240) so only tenant-relevant namespace changes
trigger reconciles.
controllers/evalhub/service_accounts.go (2)

900-905: Reconcile labels/annotations on existing tenant resources to prevent orphaned RBAC.

For pre-existing objects, metadata drift is not corrected. Since cleanup relies on tenantLabel, missing labels can leave stale RBAC after annotation removal or instance deletion.

Also applies to: 947-963

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@controllers/evalhub/service_accounts.go` around lines 900 - 905, When an
existing tenant resource is found (e.g., in ensureTenantServiceAccount) its
metadata should be reconciled instead of returning early; merge/patch the
provided labels and annotations into the existing object's ObjectMeta
(preserving other keys) and call Update (or Patch) to persist changes so
tenantLabel and annotations are never missing; apply the same pattern to the
other tenant reconciliation functions referenced (the code block around lines
947-963) so pre-existing Role/RoleBinding/ServiceAccount objects get their
labels and annotations reconciled rather than skipped.

773-779: Filter out non-tenant control-plane namespaces by annotation value.

Current logic treats any namespace with tenantAnnotation as tenant. If control-plane namespaces also carry that key (e.g., "Control Plane"), tenant RBAC can be provisioned there unintentionally.

✅ Suggested tenant namespace filter
- if _, ok := ns.Annotations[tenantAnnotation]; ok {
-   tenantNS[ns.Name] = true
- }
+ if desc, ok := ns.Annotations[tenantAnnotation]; ok {
+   if strings.EqualFold(strings.TrimSpace(desc), "Control Plane") {
+     continue
+   }
+   tenantNS[ns.Name] = true
+ }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@controllers/evalhub/service_accounts.go` around lines 773 - 779, The loop
currently treats any namespace with the key tenantAnnotation as a tenant; change
it to also validate the annotation value before marking tenantNS[ns.Name] = true
(e.g., read value := ns.Annotations[tenantAnnotation] and only set tenantNS when
value equals the canonical tenant marker like "tenant" or a configurable
expectedTenantAnnotationValue), and skip namespaces where the annotation value
indicates control-plane (e.g., "Control Plane") or is empty; update the code
around nsList, ns.Annotations, tenantAnnotation, tenantNS and instance.Namespace
accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@controllers/evalhub/service_accounts.go`:
- Around line 791-792: The tenant ownership selector uses only instance.Name
(managedLabel := client.MatchingLabels{tenantLabel: instance.Name}) which can
collide across namespaces; change the label value to include the namespace
(e.g., combine instance.Namespace and instance.Name or use
instance.GetNamespace()) so the selector is unique per EvalHub CR, and update
the other occurrences of tenantLabel usage (the other managedLabel/selector
constructions around the references noted) to use the same namespaced label
format so RBAC cleanup/update won't cross namespaces.

---

Nitpick comments:
In `@controllers/evalhub/evalhub_controller.go`:
- Around line 212-214: The namespace Watch currently enqueues all EvalHub
reconciles for any Namespace change; restrict it by adding a predicate
(predicate.Funcs) to the Watches on corev1.Namespace that only returns true for
Create/Delete or Update events where the specific tenant annotation key is
added/removed/changed (compare old/new annotations for the tenant key), and keep
using handler.EnqueueRequestsFromMapFunc(r.namespacesToEvalHub); apply the same
predicate-based filtering to the other Watches in the same block (the ones
referenced around lines 221-240) so only tenant-relevant namespace changes
trigger reconciles.

In `@controllers/evalhub/service_accounts.go`:
- Around line 900-905: When an existing tenant resource is found (e.g., in
ensureTenantServiceAccount) its metadata should be reconciled instead of
returning early; merge/patch the provided labels and annotations into the
existing object's ObjectMeta (preserving other keys) and call Update (or Patch)
to persist changes so tenantLabel and annotations are never missing; apply the
same pattern to the other tenant reconciliation functions referenced (the code
block around lines 947-963) so pre-existing Role/RoleBinding/ServiceAccount
objects get their labels and annotations reconciled rather than skipped.
- Around line 773-779: The loop currently treats any namespace with the key
tenantAnnotation as a tenant; change it to also validate the annotation value
before marking tenantNS[ns.Name] = true (e.g., read value :=
ns.Annotations[tenantAnnotation] and only set tenantNS when value equals the
canonical tenant marker like "tenant" or a configurable
expectedTenantAnnotationValue), and skip namespaces where the annotation value
indicates control-plane (e.g., "Control Plane") or is empty; update the code
around nsList, ns.Annotations, tenantAnnotation, tenantNS and instance.Namespace
accordingly.

In `@controllers/evalhub/tenant_rbac_test.go`:
- Around line 255-267: Replace the loose "err != nil" assertions used after
fakeClient.Get for the service account (sa) and RoleBinding (rb) cleanup checks
with explicit NotFound checks using k8s API errors (apierrors.IsNotFound(err));
update the assertions around the Get calls (the ones referencing
evalHubName+"-jobs" and evalHubName+"-tenant-jobs-writer") to assert that
apierrors.IsNotFound(err) is true and include a clear message, and import
k8s.io/apimachinery/pkg/api/errors as apierrors if not already present; apply
the same change to the other similar assertions mentioned (around lines 329-342
and 404-409).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3fd576a and 8ea9e73.

📒 Files selected for processing (5)
  • config/rbac/role.yaml
  • controllers/evalhub/constants.go
  • controllers/evalhub/evalhub_controller.go
  • controllers/evalhub/service_accounts.go
  • controllers/evalhub/tenant_rbac_test.go

Comment on lines +791 to +792
managedLabel := client.MatchingLabels{tenantLabel: instance.Name}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Tenant ownership selector is not unique enough across namespaces.

Using only instance.Name for tenantLabel allows collisions when two EvalHub CRs share a name in different namespaces, which can cause cross-instance update/cleanup of tenant RBAC.

🛠️ Suggested ownership key hardening
@@
- managedLabel := client.MatchingLabels{tenantLabel: instance.Name}
+ ownerKey := normalizeDNS1123LabelValue(instance.Namespace + "-" + instance.Name)
+ managedLabel := client.MatchingLabels{tenantLabel: ownerKey}
@@
- managedLabels := map[string]string{
-   tenantLabel:                  instance.Name,
+ ownerKey := normalizeDNS1123LabelValue(instance.Namespace + "-" + instance.Name)
+ managedLabels := map[string]string{
+   tenantLabel:                  ownerKey,
@@
- managedLabel := client.MatchingLabels{tenantLabel: instance.Name}
+ ownerKey := normalizeDNS1123LabelValue(instance.Namespace + "-" + instance.Name)
+ managedLabel := client.MatchingLabels{tenantLabel: ownerKey}

Also applies to: 836-838, 974-975

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@controllers/evalhub/service_accounts.go` around lines 791 - 792, The tenant
ownership selector uses only instance.Name (managedLabel :=
client.MatchingLabels{tenantLabel: instance.Name}) which can collide across
namespaces; change the label value to include the namespace (e.g., combine
instance.Namespace and instance.Name or use instance.GetNamespace()) so the
selector is unique per EvalHub CR, and update the other occurrences of
tenantLabel usage (the other managedLabel/selector constructions around the
references noted) to use the same namespaced label format so RBAC cleanup/update
won't cross namespaces.

Copy link

@mariusdanciu mariusdanciu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objections.

@openshift-ci
Copy link

openshift-ci bot commented Mar 3, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mariusdanciu

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ruivieira ruivieira merged commit 2006c90 into trustyai-explainability:dev/evalhub-multi-tenancy Mar 3, 2026
3 checks passed
ruivieira added a commit to ruivieira/trustyai-service-operator that referenced this pull request Mar 4, 2026
…ecution (trustyai-explainability#656)

* feat(operator): add per-tenant namespace RBAC for multi-tenant job execution

* feat(operator): annotate tenant RBAC resources with owner
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/enhancement New feature or request lgtm

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants